An improved noise suppression system (400) which performs speech quality enhancement upon the speech-plus-noise
signal available at the input (205) to generate a clean speech signal at the output (265) by spectral gain modification. The
noise suppression system includes a background noise estimator (420) which generates and stores an estimate of the
background noise power spectral density based upon pre-processed speech (215), as determined by the detected minima of the
post-processed speech energy level. This post-processed speech (255) may be obtained directly from the output of the noise
suppression system, or may be simulated by multiplying the pre-processed speech energy (225) by the channel gain values of
the modification signal (245). The channel gain controller (240) produces these individual channel gain values for
application to both the channel gain modifier (250) and the background noise estimator (420). Each individual channel gain value
is selected as a function of (a) the channel number, (b) the current channel SNR estimate, and (c) the overall average
background noise level. The technique of implementing post-processed signal to generate the background noise estimate (325)
provides a more accurate measurement of the background noise energy since it is based upon a much cleaner speech signal.
As a result, the present invention performs acoustic noise suppression in high ambient noise backgrounds with
significantly less voice quality degradation.
NOISE SUPPRESSION SYSTEM



Background of the Invention

1. Field of the Invention

The present invention relates generally to
acoustic noise suppression systems, and, more
particularly, to an improved method and means for
suppressing environmental background noise from speech
signals to obtain speech quality enhancement.

2. Description of the Prior Art

Acoustic noise suppression systems generally
serve the purpose of improving the overall quality of the
desired signal by distinguishing the signal from the
ambient background noise. More specifically, in speech
communications systems, it is highly desirable to improve
the signal-to-noise ratio (SNR) of the voice signal to
enhance the quality of speech. This speech enhancement
process is particularly necessary in environments having
abnormally high levels of ambient background noise, such
as an aircraft, a moving vehicle, or a noisy factory.

A typical application for noise suppression is in
a hearing aid. Environmental background noise is not
only annoying to the hearing-impaired, but often
interferes with their ability to understand speech. One
method of addressing this problem may be found in U.S.
Patent No. 4,461,025, entitled "Automatic Background
Noise Suppressor." According to this approach, the
speech signal is enhanced by automatically suppressing
the audio signal in the absence of speech, and increasing
the audio system gain when speech is present. This
variation of an automatic gain control (AGC) circuit
examines the incoming audio waveform itself to determine
if the desired speech component is present.

A second method for enhancing the intelligibility
of speech in a hearing aid application is described in
U.S. Patent No. 4,454,609. This technique emphasizes the
spectral content of consonant sounds of speech to
equalize the intensity of consonant sounds with that of
vowel sounds. The estimated spectral shape of the input
speech is used to modify the spectral shape of the actual
speech signal so as to produce an enhanced output speech
signal. For example, a control signal may select one of
a plurality of different filters having particularized
frequency responses for modifying the spectral shape of
the input speech signal, thereby producing an enhanced
consonant output signal.

A more sophisticated approach to a noise
suppression system implementation is the spectral
subtraction (or spectral gain modification)
technique. Using this approach, the audio input signal
spectrum is divided into individual spectral bands by a
bank of bandpass filters, and particular spectral bands
are attenuated according to their noise energy content.
A spectral subtraction noise suppression prefilter is
described in R. J. McAulay and M. L. Malpass, "Speech
Enhancement Using a Soft-Decision Noise Suppression
Filter," IEEE Trans. Acoust., Speech, Signal Processing,
vol. ASSP-28, no. 2, (April 1980), pp. 137-143. This
prefilter utilizes an estimate of the background noise
power spectral density to generate the speech SNR, which,
in turn, is used to compute a gain factor for each
individual channel. The gain factor is used as a pointer
for a look-up table to determine the attenuation for that
particular spectral band. The channels are then
attenuated and recombined to produce the noise-suppressed
output waveform.

However, in specialized applications involving
relatively high background noise environments, a more
effective noise suppression technique is being sought.
For example, some cellular mobile radio telephone systems
currently offer a vehicle speakerphone option providing
hands-free operation for the automobile driver. The
mobile hands-free microphone is typically located at a
greater distance from the user, such as being mounted
overhead on the visor. The more distant microphone
delivers a much poorer signal-to-noise level to the
land-end party due to road and wind noise within the
vehicle. Although the received speech at the land end is
usually intelligible, the high background noise level can
be very annoying.

Although the aforementioned prior art techniques
may perform sufficiently well under nominal background
noise conditions, the performance of these approaches
becomes severely limited when used under such high
background noise conditions. Utilizing typical noise
suppression systems, the noise level over most of the
audio band can be reduced by 10 dB without seriously
affecting the voice quality. However, when these prior
art techniques are used in relatively high background
noise environments requiring noise suppression levels
approaching 20 dB, there is a substantial degradation in
voice quality.

A need, therefore, exists for an improved 
acoustic noise suppression system which provides 
sufficient background noise attenuation in high ambient 
noise environments without significantly affecting the 
quality of the desired signal. 

Summary of the Invention 
Accordingly, it is an object of the present
invention to provide an improved method and apparatus for
suppressing background noise in high background noise
environments.

Another object of the present invention is to
provide an improved noise suppression system for speech
communication which attains the optimum compromise
between noise suppression depth and voice quality
degradation.

A more particular object of the present invention
is to provide a noise suppression system particularly
adapted for use in hands-free cellular mobile radio
telephone applications.

A further object of the present invention is to
provide a low-cost acoustic noise suppression system
capable of being implemented in an eight-bit
microcomputer.

Briefly described, the present invention is an
improved noise suppression system which performs speech
quality enhancement by attenuating the background noise
from a noisy pre-processed input signal (the
speech-plus-noise signal available at the input of the
noise suppression system) to produce a noise-suppressed
post-processed output signal (the speech-minus-noise
signal provided at the output of the noise suppression
system) by spectral gain modification. The noise
suppression system of the present invention includes a
means for separating the input signal into a plurality of
pre-processed signals representative of selected
frequency channels, and a means for modifying an
operating parameter, such as the gain, of each of these
pre-processed signals according to a modification signal
to provide post-processed noise-suppressed output
signals. The means for generating the modification
signal produces gain factors for each channel by
automatically selecting one of a plurality of gain table
sets in response to the overall average background noise
level of the input signal, and by selecting one of a
plurality of gain values from each gain table in response
to the individual channel signal-to-noise ratio estimate.
Thus, each individual channel gain value is selected as a
function of (a) the channel number, (b) the current
channel SNR estimate, and (c) the overall average
background noise level. Accordingly, the noise
suppression system of the present invention utilizes
post-processed signal energy (signal energy available
at the output of the noise suppression system) to
generate a modification signal to control the noise
suppression parameters. It is these techniques of
implementing the post-processed signal to generate the
modification signal, and automatically selecting one of a
plurality of gain table sets per the average overall
background noise level, that allow the present invention
to perform acoustic noise suppression in high ambient
noise backgrounds with significantly less voice quality
degradation.
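
The three-way selection summarized above (channel number, channel SNR estimate, overall average background noise level) might be sketched as follows. This is only an illustrative sketch: the table contents, the number of table sets, the noise-level breakpoint, and the SNR breakpoints are hypothetical placeholders and are not values taken from this specification.

```python
from bisect import bisect_right

# Hypothetical gain table sets: GAIN_TABLE_SETS[set][channel][snr_index].
# Real tables would be empirically determined; these values are placeholders.
GAIN_TABLE_SETS = [
    # set 0: low overall background noise (mild suppression)
    [[0.7, 0.85, 1.0], [0.7, 0.85, 1.0]],
    # set 1: high overall background noise (deeper suppression)
    [[0.2, 0.6, 1.0], [0.3, 0.65, 1.0]],
]
NOISE_LEVEL_BREAKPOINTS = [50.0]  # assumed boundary between the two table sets
SNR_BREAKPOINTS = [2.0, 8.0]      # assumed boundaries between SNR indices


def select_gain(channel, channel_snr, overall_noise_level):
    """Pick a gain from the channel number, channel SNR, and overall noise level."""
    set_index = bisect_right(NOISE_LEVEL_BREAKPOINTS, overall_noise_level)
    snr_index = bisect_right(SNR_BREAKPOINTS, channel_snr)
    return GAIN_TABLE_SETS[set_index][channel][snr_index]
```

With these placeholder tables, a low-SNR channel is attenuated more deeply when the overall noise level selects the second table set than when it selects the first.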

Brief Description of the Drawings 

The features of the present invention which are
believed to be novel are set forth with particularity in
the appended claims. The invention itself, however,
together with further objects and advantages thereof, may
best be understood by reference to the following
description when taken in conjunction with the
accompanying drawings, in which:

Figure 1 is a block diagram of a basic noise
suppression system known in the art which illustrates the
spectral gain modification technique;

Figure 2 is a block diagram of an alternate
implementation of a prior art noise suppression system
illustrating the channel filter-bank technique;

Figure 3 is a block diagram of an improved
acoustic noise suppression system employing the
background noise estimation technique of the present
invention;

Figure 4 is a block diagram of an alternate
implementation of the present invention utilizing
simulated post-processed signal energy to generate the
background noise estimate;

Figure 5 is a detailed block diagram illustrating
the preferred embodiment of the improved noise
suppression system according to the present invention;

Figure 6 is a flowchart illustrating the general
sequence of operations performed in accordance with the
practice of the present invention; and

Figure 7 is a detailed flowchart illustrating
specific sequences of operations shown in Figure 6.

Description of the Preferred Embodiment 

Referring now to the accompanying drawings,
Figure 1 illustrates the general principle of spectral
subtraction noise suppression as known in the art. A
continuous time signal containing speech plus noise is
applied to input 102 of noise suppression system 100.
This signal is then converted to digital form by
analog-to-digital converter 105. The digital data is
then segmented into blocks of data by the windowing
operation (e.g., Hamming, Hanning, or Kaiser windowing
techniques) performed by window 110. The choice of the
window is similar to the choice of the filter response in
an analog spectrum analysis. The noisy speech signal is
then converted into the frequency domain by Fast Fourier
Transform (FFT) 115. The power spectrum of the noisy
speech signal is calculated by magnitude squaring
operation 120, and applied to background noise estimator
125 and to power spectrum modifier 130.

The background noise estimator performs two
functions: (1) it determines when the incoming
speech-plus-noise signal contains only background noise; and (2)
it updates the old background noise power spectral
density estimate when only background noise is present.
The current estimate of the background noise power
spectrum is subtracted from the speech-plus-noise power
spectrum by power spectrum modifier 130, which ideally
leaves only the power spectrum of clean speech. The
square root of the clean speech power spectrum is then
calculated by magnitude square root operation 135. This
magnitude of the clean speech signal is added to phase
information 145 of the original signal, and converted
from the frequency domain back into the time domain by
Inverse Fast Fourier Transform (IFFT) 140. The discrete
data segments of the clean speech signal are then applied
to overlap-and-add operation 150 to reconstruct the
processed signal. This digital signal is then
re-converted by digital-to-analog converter 155 to an
analog waveform available at output 158. Thus, an
acoustic noise suppression system employing the spectral
subtraction technique requires an accurate estimate of
the current background noise power spectral density to
perform the noise cancellation function.
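
The subtraction and square-root steps performed by power spectrum modifier 130 and magnitude square root operation 135 might be sketched as below. This is a minimal sketch operating on already-computed power spectra; the function name is hypothetical, and the clamp at zero is an assumption added so that an overestimated noise bin cannot yield a negative power.

```python
import math


def spectral_subtract(noisy_power, noise_power):
    """Subtract the noise power estimate bin by bin and return clean magnitudes."""
    magnitudes = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        # Clamp at zero: an assumed guard so the square root stays real when
        # the noise estimate exceeds the measured power in a bin.
        clean_power = max(p_noisy - p_noise, 0.0)
        magnitudes.append(math.sqrt(clean_power))
    return magnitudes
```

The returned magnitudes would then be recombined with the original phase before the inverse transform, which this sketch does not attempt.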

One drawback of the Fourier Transform approach of
Figure 1 is that it is a digital signal processing
technique requiring considerable computational power to
implement the noise suppression system in the frequency
domain. Another disadvantage of the FFT approach is that
the output signal is delayed by the time required to
accumulate the samples for the FFT calculation.

An alternate implementation of a spectral
subtraction noise suppression system is the channel
filter-bank technique illustrated in Figure 2. In noise
suppression system 200, the speech-plus-noise signal
available at input 205 is separated into a number of
selected frequency channels by channel divider 210. The
gain of these individual pre-processed speech channels
215 is then adjusted by channel gain modifier 250 in
response to modification signal 245 such that the gain of
the channels exhibiting a low speech-to-noise ratio is
reduced. The individual channels comprising
post-processed speech 255 are then recombined in channel
combiner 260 to form the noise-suppressed speech signal
available at output 265.

Channel divider 210 is typically comprised of a
number N of contiguous bandpass filters. The filters
overlap at the 3 dB points such that the reconstructed
output signal exhibits less than 1 dB of ripple in the
entire voice frequency range. In the present embodiment,
14 Butterworth bandpass filters are used to span the
frequency range 250-3400 Hz, although any number and
type of filters may be used. Also, in the preferred
embodiment, the filter-bank of channel divider 210 is
digitally implemented. This particular implementation
will subsequently be described in Figures 6 and 7.

Channel gain modifier 250 serves to adjust the
gain of each of the individual channels containing
pre-processed speech 215. This modification is performed
by multiplying the amplitude of the pre-processed input
signal in a particular channel by its corresponding
channel gain value obtained from modification signal 245.
The channel gain modification function may readily be
implemented in software utilizing digital signal
processing (DSP) techniques.

Similarly, the summing function of channel
combiner 260 may be implemented either in software, using
DSP, or in hardware utilizing a summation circuit to
combine the N post-processed channels into a single
post-processed output signal. Hence, the channel
filter-bank technique separates the noisy input signal
into individual channels, attenuates those channels
having a low speech-to-noise ratio, and recombines the
individual channels to form a low-noise output signal.
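
The gain modification and summing functions just described reduce, for one sample instant, to a per-channel multiply followed by a sum. The sketch below assumes the channel divider has already produced one sample per channel; the function name is hypothetical.

```python
def suppress_frame(channel_samples, channel_gains):
    """Scale each channel sample by its gain, then sum the channels."""
    if len(channel_samples) != len(channel_gains):
        raise ValueError("one gain value is required per channel")
    # Channel gain modifier: multiply each channel amplitude by its gain.
    post_processed = [s * g for s, g in zip(channel_samples, channel_gains)]
    # Channel combiner: sum the N post-processed channels into one output.
    return sum(post_processed)
```

For example, `suppress_frame([1.0, 2.0, 4.0], [1.0, 0.5, 0.25])` attenuates the second and third channels before recombining them with the first.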

The individual channels comprising pre-processed
speech 215 are also applied to channel energy estimator
220 which serves to generate energy envelope values
E1-EN for each channel. These energy values, which
comprise channel energy estimate 225, are utilized by
channel noise estimator 230 to provide an SNR estimate
X1-XN for each channel. The SNR estimates 235 are
then fed to channel gain controller 240 which provides
the individual channel gain values G1-GN comprising
modification signal 245.

Channel energy estimator 220 is comprised of a
set of N energy detectors to generate an estimate of the
pre-processed signal energy in each of the N channels.
Each energy detector may consist of a full-wave
rectifier, followed by a second-order Butterworth
low-pass filter, possibly followed by another full-wave
rectifier. The preferred embodiment of the invention
utilizes DSP implementation techniques in software,
although numerous other approaches may be used. An
appropriate DSP algorithm is described in Chapter 11 of
L. R. Rabiner and B. Gold, Theory and Application of
Digital Signal Processing, (Prentice Hall, Englewood
Cliffs, N.J., 1975).
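
A single energy detector of this kind might be sketched in software as below. Note the substitution: for brevity this sketch uses a one-pole low-pass filter in place of the second-order Butterworth filter named in the text, and the `smoothing` coefficient is a hypothetical value chosen only for illustration.

```python
def energy_envelope(samples, smoothing=0.1):
    """Full-wave rectify a channel signal, then low-pass filter the result."""
    envelope = []
    state = 0.0
    for x in samples:
        rectified = abs(x)                        # full-wave rectifier
        state += smoothing * (rectified - state)  # one-pole low-pass (assumed)
        envelope.append(state)
    return envelope
```

Applied to each of the N channel signals in turn, a routine like this would yield the per-channel energy envelope values.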

Channel noise estimator 230 generates SNR
estimates X1-XN by comparing the individual channel
energy estimates of the current input signal energy
(signal) to some type of current estimate of the
background noise energy (noise). This background noise
estimate may be generated by performing a channel energy
measurement during the pauses in human speech. Thus, a
background noise estimator continuously monitors the
input speech signal to locate the pauses in speech such
that the background noise energy can be measured during
that precise time segment. A channel SNR estimator
compares this background noise estimate to the input
signal energy estimate to form signal-to-noise estimates
on a per-channel basis. In the present embodiment, this
SNR comparison is performed as a software division of the
channel energy estimates by the background noise
estimates on an individual channel basis.
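
The per-channel software division described above might look like the following sketch. The function name and the `floor` guard against division by zero are assumptions introduced for illustration.

```python
def channel_snr_estimates(channel_energies, noise_estimates, floor=1e-12):
    """Divide each channel energy estimate by its background noise estimate."""
    return [
        energy / max(noise, floor)  # floor: assumed guard against division by zero
        for energy, noise in zip(channel_energies, noise_estimates)
    ]
```

Because the channel energy contains both speech and noise, a ratio near 1 indicates a channel dominated by background noise.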
Channel gain controller 240 generates the
individual channel gain values of the modification signal
245 in response to SNR estimates 235. One method of
selecting gain values is to compare the SNR estimate with
a preselected threshold, and to provide for unity gain
when the SNR estimate is below the threshold, while
providing an increased gain above the threshold. A
second approach is to compute the gain value as a
function of the SNR estimate such that the gain value
corresponds to a particular mathematical relationship to
the SNR (i.e., linear, logarithmic, etc.). The present
embodiment uses a third approach, that of selecting the
channel gain values from a channel gain table set
comprised of empirically determined gain values. This
approach will be fully described in conjunction with
Figure 5.
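
The first two gain selection approaches can be contrasted in a short sketch. All numeric parameters here (`threshold`, `boosted_gain`, `slope`, `max_gain`) and both function names are hypothetical; the text does not specify them.

```python
import math


def threshold_gain(snr, threshold=2.0, boosted_gain=2.0):
    """First approach: unity gain below the threshold, increased gain above it."""
    return boosted_gain if snr > threshold else 1.0


def logarithmic_gain(snr, slope=0.5, max_gain=4.0):
    """Second approach: gain follows a fixed mathematical function of the SNR."""
    # SNR ratios at or below 1 (0 dB) keep unity gain; an assumed choice.
    gain = 1.0 + slope * math.log10(max(snr, 1.0))
    return min(gain, max_gain)
```

The threshold rule switches abruptly between two gain values, whereas the functional rule varies the gain smoothly with the SNR estimate.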

As noted above, the background noise estimate may
be generated by performing a measurement of the
pre-processed signal energy during the pauses in human
speech. Accordingly, the background noise estimator must
accurately locate the pauses in speech by performing a
speech/noise decision to control the time in which a
background noise energy measurement is performed.
Previous methods for making the speech/noise decision
have heretofore been implemented by utilizing input
signal energy (the signal-plus-noise energy available
at the input of the noise suppression system). This
practice of using the input signal places inherent
limitations upon the effectiveness of any background
noise estimation technique. These limitations are due to
the fact that the energy characteristics of unvoiced
speech sounds are very similar to the energy
characteristics of background noise. In a relatively
high background noise environment, the speech/noise
decision process becomes very difficult and,
consequently, the background noise estimate becomes
highly inaccurate. This inaccuracy directly affects the
performance of the noise suppression system as a whole.

If, however, the speech/noise decision of the
background noise estimate were based upon output signal
energy (the signal energy available at the output of
the noise suppression system), then the accuracy of the
speech/noise decision process would be greatly enhanced
by the noise suppression system itself. In other words,
by utilizing post-processed speech (the speech energy
available at the output of the noise suppression system),
the background noise estimator operates on a much
cleaner speech signal such that a more accurate
speech/noise classification can be performed. The
present invention teaches this unique concept of
implementing post-processed speech signal to base these
speech/noise decisions upon. Accordingly, more accurate
determinations of the pauses in speech are made, and
better performance of the noise suppressor is achieved.

This novel technique of the present invention is
illustrated in Figure 3, which shows a simplified block
diagram of improved acoustic noise suppression
system 300. Channel divider 210, channel gain modifier
250, channel combiner 260, channel gain controller 240,
and channel energy estimator 220 remain unchanged from
noise suppression system 200. However, channel noise
estimator 230 of Figure 2 has been replaced by channel
SNR estimator 310, background noise estimator 320, and
channel energy estimator 330. In combination, these
three elements generate SNR estimates 235 based upon both
pre-processed speech 215 and post-processed speech 255.

Operation and construction of channel energy
estimator 330 is identical to that of channel energy
estimator 220, with the exception that post-processed
speech 255, rather than pre-processed speech 215, is
applied to its input. The post-processed channel energy
estimates 335 are used by background noise estimator 320
to perform the speech/noise decision.
In generating background noise estimate 325, two
basic functions must be performed. First, a
determination must be made as to when the incoming
speech-plus-noise signal contains only background noise,
that is, during the pauses in human speech. This speech/noise
decision is performed by periodically detecting the
minima of post-processed speech signal 255, either on an
individual channel basis or an overall combined-channel
basis. Secondly, the speech/noise decision is utilized
to control the time at which the background noise energy
measurement is taken, thereby providing a mechanism to
update the old background noise estimate. A background
noise estimate is performed by generating and storing an
estimate of the background noise energy of pre-processed
speech 215 provided by pre-processed channel energy
estimate 225. Numerous methods may be used to detect the
minima of the post-processed signal energy, or to
generate and store the estimate of the background noise
energy based upon the pre-processed signal. The
particular approach used in the present embodiment for
performing these functions will be described in
conjunction with Figure 6.

Channel SNR estimator 310 compares background
noise estimate 325 to channel energy estimates 225 to
generate SNR estimates 235. As previously noted, this
SNR comparison is performed in the present embodiment as
a software division of the channel energy estimates
(signal-plus-noise) by the background noise estimates
(noise) on an individual channel basis. SNR estimates
235 are used to select particular gain values from a
channel gain table comprised of empirically determined
gains.

It is this method of more accurately controlling
the time at which the background noise measurement is
performed, by basing the time determination upon
post-processed speech energy, that provides a more
accurate measurement of the pre-processed speech for the
background noise estimate. Consequently, the performance
of the entire noise suppression system is improved by
deriving the speech/noise decision from post-processed
speech.

Figure 4 is an alternate implementation of the
present invention illustrating how the post-processed
speech energy, used by the background noise estimator,
may be obtained in a different manner. Post-processed
speech energy may be "simulated" by multiplying
pre-processed channel energy estimates 225, obtained from
channel energy estimator 220, by the channel gain values
of modification signal 245, obtained from channel gain
controller 240. This multiplication is performed on a
per-channel basis in background noise estimator 420,
thereby providing a plurality of background noise
estimates 325 to channel SNR estimator 310. In the
present embodiment, this multiplication process is
performed by an energy estimate modifier incorporated in
background noise estimator 420. Alternatively, this
simulated post-processed speech may be provided by an
external multiplication block, or by other modification
means.
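
The per-channel multiplication that simulates post-processed speech energy might be sketched as follows; the function name is hypothetical, and the energies and gains are assumed to be plain per-channel lists.

```python
def simulate_post_processed_energy(pre_energies, channel_gains):
    """Approximate post-processed channel energies without a second estimator."""
    return [energy * gain for energy, gain in zip(pre_energies, channel_gains)]
```

The result stands in for the energy estimates that would otherwise be measured on the post-processed channels themselves.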






The advantage of providing simulated
post-processed speech energy to the background noise
estimator is that a second channel energy estimator (330)
is no longer required. Channel energy estimator 220
provides pre-processed speech energy estimates 225 for
each channel which, when multiplied by the individual
channel gain factors, represent post-processed speech
energy estimates 335 normally provided by post-processed
channel energy estimator 330. Therefore, the function of
one channel energy estimator block may be saved at the
expense of some type of energy estimate modification
block. Depending on the system configuration and
implementation, the advantage of using simulated
post-processed speech (provided by a modification block)
versus post-processed speech (obtained directly from the
output) may be significant.

WO 87/00366    PCT/US86/00990



Figure 5 is a detailed block diagram of the
preferred embodiment of the present invention. Improved
noise suppression system 500 incorporates numerous useful
noise suppression techniques: (a) the channel filter-bank
noise suppression technique illustrated in Figure 2; (b)
the simulated post-processed speech energy technique for
background noise estimation as shown in Figure 4; (c) the
energy valley detector technique for performing the
speech/noise decision; (d) a novel technique for
selecting gain values from multiple gain tables according
to overall background noise level; and (e) a new method
of smoothing the gain factors on a per-sample basis.

Referring now to Figure 5, analog-to-digital
converter 510 samples the noisy speech signal at input
205 every 125 microseconds. This digital signal is then
applied to pre-emphasis filter 520 which provides
approximately 6 dB per-octave pre-emphasis to the signal
before it is separated into channels. Pre-emphasis is
used because both high frequency noise and high frequency
voice components are normally lower in energy level as
compared to low frequency noise and voice. The
pre-emphasized signal is then applied to channel divider
210, which separates the input signal into N signals
representative of selected frequency channels. These N
channels comprising pre-processed speech 215 are then
applied to channel energy estimator 220 and channel gain
modifier 250, as previously described. After gain
modification, the individual channels comprising
post-processed speech 255 are summed by channel combiner
260 to form a single post-processed output signal. This
signal is then de-emphasized at approximately 6 dB
per-octave by de-emphasis network 540 before being
re-converted to an analog waveform by digital-to-analog
converter 550. The noise-suppressed (clean) speech
signal is then available at output 265.
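
A sampling interval of 125 microseconds corresponds to 8000 samples per second. The pre-emphasis and de-emphasis stages might be sketched as a matched pair of first-order filters, as below. The first-order difference form and the coefficient `coeff=0.9` are assumptions made for illustration; the text specifies only the approximate 6 dB per-octave slope.

```python
SAMPLE_PERIOD_S = 125e-6                       # from the text: 125 microseconds
SAMPLE_RATE_HZ = round(1.0 / SAMPLE_PERIOD_S)  # 8000 samples per second


def pre_emphasize(samples, coeff=0.9):
    """First-order high-frequency boost: y[n] = x[n] - coeff * x[n-1]."""
    out, prev = [], 0.0
    for x in samples:
        out.append(x - coeff * prev)
        prev = x
    return out


def de_emphasize(samples, coeff=0.9):
    """Inverse of pre_emphasize: y[n] = x[n] + coeff * y[n-1]."""
    out, prev = [], 0.0
    for x in samples:
        prev = x + coeff * prev
        out.append(prev)
    return out
```

With no gain modification in between, `de_emphasize(pre_emphasize(x))` returns the original samples up to rounding, which is why the two stages can bracket the channel processing.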
The energy in each of the N channels is measured
by channel energy estimator 220 to produce channel energy
estimates 225. These energy envelope values are applied
to three distinct blocks. First, the pre-processed
signal energy estimates are multiplied by raw channel
gain values 535 in energy estimate modifier 560. This
multiplication serves to simulate post-processed energy
by performing essentially the same function as channel
gain modifier 250, except on a channel energy level
rather than on a channel signal level. The individual
simulated post-processed channel energy estimates from
energy estimate modifier 560 are applied to channel
energy combiner 565 which provides a single overall
energy estimate for energy valley detector 570. Channel
energy combiner 565 may be omitted if multiple valley
detectors are utilized on a per-channel basis and the
valley detector output signals are combined.

Energy valley detector 570 utilizes the overall
energy estimate from combiner 565 to detect the pauses in
speech. This is accomplished in three steps. First, an
initial valley level is established. If background noise
estimator 420 has not previously been initialized, then
an initial valley level is created which would correspond
to a high background noise environment. Otherwise, the
previous valley level is maintained as its post-processed
background noise energy history. Next, the previous (or
initialized) valley level is updated to reflect current
background noise conditions. This is accomplished by
comparing the previous valley level to the single overall
energy estimate from combiner 565. A current valley
level is formed by this updating process, which will be
described in detail in Figure 7. The third step
performed by energy valley detector 570 is that of making
the actual speech/noise decision. A preselected valley
offset is added to the updated current valley level to
produce a noise threshold level. Then the single overall
post-processed energy estimate is again compared, only
this time to the noise threshold level. When this energy
estimate is less than the noise threshold level, energy
valley detector 570 generates a speech/noise control
signal (valley detect signal) indicating that no voice is
present.
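
The third step, the speech/noise decision itself, might be sketched as below. The valley level update of the second step is deliberately not shown, since its details are deferred to Figure 7. Both function names are hypothetical, and combining the channel energies by summation is an assumption; the text says only that a single overall energy estimate is provided.

```python
def combine_channel_energies(channel_energies):
    """Collapse per-channel energies into one overall estimate (sum assumed)."""
    return sum(channel_energies)


def valley_detect(overall_energy, valley_level, valley_offset):
    """Return True (no voice present) when energy is below the noise threshold."""
    noise_threshold = valley_level + valley_offset  # valley level plus offset
    return overall_energy < noise_threshold
```

A `True` result corresponds to the valley detect signal that indicates a pause in speech.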

The second use for pre-processed energy estimates
225 is that of updating the background noise estimate.
During the pauses in the simulated post-processed speech
signal, as determined by a positive valley detect signal
from energy valley detector 570, channel switch 575 is
closed to allow pre-processed speech energy estimates 225
to be applied to smoothing filter 580. The smoothed
energy estimates at the output of smoothing filter 580
are stored in energy estimate storage register 585.
Elements 580 and 585, connected as shown, form a
recursive filter which provides a time-averaged value of
each individual speech energy estimate. This smoothing
ensures that the current background noise estimates
reflect the average background noise estimates stored in
storage register 585, as opposed to the instantaneous
noise energy estimates available at the output of switch
575. Thus, a very accurate background noise estimate 325
is continuously available for use by the noise
suppression system.
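
The gated recursive update formed by switch 575, smoothing
filter 580 and storage register 585 may be sketched in Python
as follows. This is only an illustrative sketch under stated
assumptions: the first-order form of the filter, the function
name `update_noise_estimate` and the smoothing coefficient
`alpha` are hypothetical, since the text does not give the
filter coefficient.

```python
def update_noise_estimate(noise_est, channel_energy, valley_detected, alpha=0.1):
    """Return the per-channel background noise estimates after one frame.

    noise_est       -- stored estimates (contents of register 585)
    channel_energy  -- pre-processed channel energy estimates (225)
    valley_detected -- True when the valley detect signal closes switch 575
    alpha           -- assumed smoothing coefficient (not given in the text)
    """
    if not valley_detected:
        # Switch open: the stored estimates are left unchanged.
        return list(noise_est)
    # Switch closed: move each stored estimate a fraction of the way
    # toward the new energy estimate (first-order recursive average).
    return [n + alpha * (e - n) for n, e in zip(noise_est, channel_energy)]
```

With `alpha` close to zero the stored estimate changes slowly
and tracks the average noise energy, whereas `alpha` equal to
one would simply copy the instantaneous energy estimates.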

If no previous background noise estimate exists
in energy estimate storage register 585, the register is
preset with an initialization value representing a
background noise estimate approximating that of a low
noise input.

Initially, no noise suppression is being
performed. As a result, energy valley detector 570 is
performing speech/noise decisions on speech energy which
has not yet been processed. Eventually, valley detector
570 provides rough speech/noise decisions to activate
channel switch 575, which causes the initialized
background noise estimate to be updated. As the
background noise estimate is updated, the noise
suppressor begins to process the input speech energy by
suppressing the background noise. Consequently, the
post-processed speech energy exhibits a slightly greater
signal-to-noise ratio for the valley detector to utilize
in making more accurate speech/noise classifications.
After the system has been in operation for a short period
of time (e.g., 100-500 milliseconds), the valley detector
is operating on an improved SNR speech signal. Thus,
reliable speech/noise decisions control switch 575,
which, in turn, permits energy estimate storage register
585 to very accurately reflect the background noise power
spectrum. It is this "bootstrapping technique" (updating
the initialization values with more accurate background
noise estimates) that allows the present invention to
generate very accurate background noise estimates for an
acoustic noise suppression system.

The third use for pre-processed channel energy
estimates 225 is for application to channel SNR estimator
310. As previously noted, these estimates represent
signal-plus-noise for comparison to background noise
estimate 325, representing noise only. This
signal-to-noise comparison is performed as a software
division in channel SNR estimator 310 to produce channel
SNR estimates 235. These SNR estimates are used to
select particular channel gain values comprising
modification signal 245.

Gain tables generally provide nonlinear mapping
between the channel SNR inputs X1-XN and the channel
gain outputs G1-GN. A gain table is basically a
two-dimensional array of empirically-determined gain values.
These channel gain values are typically selected as a
function of two variables: (a) the individual channel
number N; and (b) the individual SNR estimate XN. When
voice is present in an individual channel, the channel
signal-to-noise ratio estimate will be high. A large SNR
estimate XN would result in a channel gain value GN
approaching a maximum value (i.e., 1 in the present
embodiment). The amount of the gain rise may be designed
to be dependent upon the detected SNR: the greater the
SNR, the more the individual channel gain will be raised
from the base gain (all noise). If only noise is present
in the individual channel, the SNR estimate will be low,
and the gain for that channel will be reduced,
approaching a minimum base gain value (i.e., 0). Voice
energy does not appear in all of the channels at the same
time, so the channels containing a low voice energy level
will be suppressed from the voice energy spectrum.

However, in unusually high background noise
environments requiring noise suppression levels of
approximately 20 dB, different noise suppression gain
factors must be chosen to correspond to such levels.
Furthermore, in certain applications exhibiting changing
noise environments, the gain factors chosen for one
background noise level may significantly degrade the
voice quality when used with a different background noise
level. This problem is particularly evident in
automobile environments where inappropriate gain factors
can cause a loss of low frequency voice components, which
makes voices sound "thin" under high noise suppression.
The present embodiment solves this problem by
selecting the channel gain values as a function of three
variables in channel gain controller 240. The first
variable is that of individual channel number 1 through
N, such that a low frequency channel gain factor may be
selected independently from that of a high frequency
channel. The second variable is the individual channel
SNR estimate. These two variables form the basis of
spectral gain modification noise suppression, since the
individual channels containing a low signal-to-noise
ratio estimate will be suppressed from the voice
spectrum.

The third variable is that of overall average
background noise level of the input signal. This third
variable permits automatic selection of one of a
plurality of gain tables, each gain table containing a
set of empirically determined channel gain values which
can be selected as a function of the other two variables.
This gain table selection technique allows a wider choice
of channel gain values, depending on the particular
background noise environment. For example, a separate
gain table set with different nonlinear relationships
between the low frequency and high frequency gain values
may be desired in a particular background noise
environment, allowing the noise-suppressed speech to
sound more normal. This technique is particularly useful
in automobile environments, where a loss of low frequency
voice components makes voices sound thin under high noise
suppression.

Again referring to Figure 5, the overall average
background noise level is determined by applying the
current valley level 525 from energy valley detector 570
to noise level quantizer 555. The output of quantizer
555 is used to select the appropriate gain table set for
the given noise environment. Noise level quantization is
required since the current valley level is a continuously
varying parameter, whereas only a discrete number of gain
table sets are available from which to choose gain
values. Noise level quantizer 555 utilizes hysteresis to
determine a particular gain table set from a range of
current valley levels, as opposed to a static (strictly
linear) threshold selection mechanism.
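
One possible way to quantize the valley level with hysteresis
is sketched below in Python. The class name, the boundary
values and the width of the hysteresis band are hypothetical
and are chosen only for illustration; the text does not
specify them.

```python
class HysteresisQuantizer:
    """Map a continuously varying valley level to a gain table index."""

    def __init__(self, thresholds, margin, initial=0):
        self.thresholds = list(thresholds)  # ascending boundaries between levels
        self.margin = margin                # half-width of the hysteresis band
        self.level = initial                # currently selected table index

    def update(self, valley_level):
        # Step up only when the level clears the upper edge of the band.
        while (self.level < len(self.thresholds)
               and valley_level > self.thresholds[self.level] + self.margin):
            self.level += 1
        # Step down only when the level falls below the lower edge of the band.
        while (self.level > 0
               and valley_level < self.thresholds[self.level - 1] - self.margin):
            self.level -= 1
        return self.level


# Two assumed boundaries give three table indices (0, 1 and 2).
quantizer = HysteresisQuantizer(thresholds=[20.0, 40.0], margin=2.0)
selections = [quantizer.update(v) for v in (21.0, 23.0, 19.0, 17.0, 50.0)]
```

In this sketch a valley level that hovers around a boundary
does not toggle the selection, because the level must move
beyond the band on either side before the index changes.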

The gain table selection signal, output from
noise level quantizer 555, is applied to gain table
switch 595 to implement the gain table selection process.
Accordingly, one of a plurality of gain table sets 590
may be chosen as a function of overall average background
noise level. Each gain table set has selected individual
channel gain values corresponding to various individual
channel SNR estimates 235. In the present embodiment,
three gain table sets are utilized, representing low,
medium, or high background noise levels. However, any
number of gain table sets may be used and any
organization of channel gain values may be implemented.

The raw channel gain values 535, available at the
output of switch 595, are applied to gain smoothing
filter 530 and to energy estimate modifier 560. As noted
above, these raw gain values are used by energy estimate
modifier 560 to produce simulated post-processed speech
energy estimates.

Gain smoothing filter 530 provides smoothing of
raw gain values 535 on a per-sample basis for each
individual channel. This per-sample smoothing of the
noise suppression gain factors significantly reduces
the noise flutter caused by step discontinuities
in frame-to-frame gain changes. Different time constants
for each channel are used to compensate for the different
gain table sets employed. The gain smoothing filter
algorithm will be described later. These smoothed gain
values comprise modification signal 245 which is applied
to channel gain modifier 250. As previously described,
the channel gain modifier performs spectral gain
modification noise suppression by reducing the relative
gain of the noisy channels.

Figure 6a/b is a flowchart illustrating the
overall operation of the present invention. The
flowchart of Figure 6a/b corresponds to improved noise
suppression system 500 of Figure 5. This generalized
flow diagram is subdivided into three functional blocks:
noise suppression loop 604, further described in detail
in Figure 7a; automatic gain selector 615, described in
more detail in Figure 7b; and automatic background noise
estimator 621, illustrated in Figures 7c and 7d.

The operation of the improved noise suppression
system of the present invention begins with Figure 6a at
initialization block 601. When the system is first
powered up, no old background noise estimate exists in
energy estimate storage register 585, and no noise energy
history exists in energy valley detector 570.
Consequently, during initialization 601, storage register
585 is preset with an initialization value representing a
background noise estimate value corresponding to a clean
speech signal at the input. Similarly, energy valley
detector 570 is preset with an initialization value
representing a valley level corresponding to a noisy
speech signal at the input.

Initialization block 601 also provides initial
sample counts, channel counts, and frame counts. For the
purposes of the following discussion, a sample period is
defined as 125 microseconds corresponding to an 8 kHz
sampling rate. The frame period is defined as being a 10
millisecond duration time interval to which the input
signal samples are quantized. Thus, a frame corresponds
to 80 samples at an 8 kHz sampling rate.

Initially, the sample count is set to zero.
Block 602 increments the sample count by one, and a noisy
speech sample is input from A/D converter 510 in block
603. The speech sample is then pre-emphasized by
pre-emphasis network 520 in block 605.

Following pre-emphasis, block 606 initializes the
channel count to one. Decision block 607 then tests the
channel count number. If the channel count is less than
the highest channel number N, the sample for that channel
is bandpass filtered, and the signal energy for that
channel is estimated in block 608. The result is saved
for later use. Block 609 smooths the raw channel gain
for the present channel, and block 610 modifies the level
of the bandpass-filtered sample utilizing the smoothed
channel gain. The N channels are then combined (also in
block 610) to form a single processed output speech
sample. Block 611 increments the channel count by one
and the procedure in blocks 607 through 611 is repeated.

If the result of the decision in 607 is true, the
combined sample is de-emphasized in block 612 and output
as a modified speech sample in block 613. The sample
count is then tested in block 614 to see if all samples
in the current frame have been processed. If samples
remain, the loop consisting of blocks 602 through 613 is
re-entered for another sample. If all samples in the
current frame have been processed, block 614 initiates
the procedure of block 615 for updating the individual
channel gains.

Continuing with Figure 6b, block 616 initializes
the channel counter to one. Block 617 tests if all
channels have been processed. If this decision is
negative, block 618 calculates the index to the gain
table for the particular channel by forming an SNR
estimate. This index is then utilized in block 619 to
obtain a channel gain value from the look-up table. The
gain value is then stored for use in noise suppression
loop 604. Block 620 then increments the channel counter,
and block 617 rechecks to see if all channel gains have
been updated. If this decision is affirmative, the
background noise estimate is then updated in block 621.

To update the background noise estimate, the
present invention first simulates post-processed energy
in block 622 by multiplying the updated raw channel gain
value by the pre-processed energy estimate for that
channel. Next, the simulated post-processed energy
estimates are combined in block 623 to form an overall
channel energy estimate for use by the valley detector.
Block 624 compares the value of this overall
post-processed energy estimate to the previous valley
level. If the energy value exceeds the previous valley
level, the previous valley level is updated in block 626
by increasing the level with a slow time constant. This
occurs when voice, or a higher background noise level, is
present. If the output of decision block 624 is negative
(post-processed energy less than previous valley level),
the previous valley level is updated in block 625 by
decreasing the level with a fast time constant. This
previous valley level decrease occurs when minimal
background noise is present. Accordingly, the background
noise history is continually updated by slowly increasing
or rapidly decreasing the previous valley level towards
the current post-processed energy estimate.

Subsequent to the updating of the previous valley
level (block 625 or 626), decision block 627 tests if the
current post-processed energy value exceeds a
predetermined noise threshold. If the result of this
comparison is negative, a decision that only noise is
present is made, and the background noise spectral
estimate is updated in block 628. This corresponds to
the closing of channel switch 575. If the result of the
test is affirmative, indicating that speech is present,
the background noise estimate is not updated. In either
case, the operation of background noise estimator 621
ends when the sample count is reset in block 629 and the
frame count is incremented in block 630. Operation then
proceeds to block 602 to begin noise suppression on the
next frame of speech.

The flowchart of Figure 7a illustrates the
specific details of the sequence of operation of noise
suppression loop 604. For every sample of input speech,
block 701 pre-emphasizes the sample by implementing the
filter described by the equation:

Y(nT) = X(nT) - K1 [X((n-1)T)]

where Y(nT) is the output of the filter at time nT, T is
the sample period, X(nT) and X((n-1)T) are the input
samples at times nT and (n-1)T respectively, and the
pre-emphasis coefficient K1 is 0.9375. As previously
noted, this filter pre-emphasizes the speech sample at
approximately +6 dB per octave.
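
The pre-emphasis equation above may be illustrated by the
following Python sketch. The function name and the zero
initial condition for X((n-1)T) at the first sample are
assumptions made for the example; the coefficient 0.9375 is
taken from the text.

```python
def pre_emphasize(samples, k1=0.9375):
    """Apply Y(nT) = X(nT) - K1 * X((n-1)T) to a sequence of samples."""
    out = []
    prev = 0.0  # assumed initial condition: the sample before the first is zero
    for x in samples:
        out.append(x - k1 * prev)
        prev = x
    return out


# A constant (low frequency) input is strongly attenuated after the first sample.
emphasized = pre_emphasize([1.0, 1.0, 0.0])
```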



Block 702 sets the channel count equal to one,
and initializes the output sample total to zero. Block
703 tests to see if the channel count is equal to the
total number of channels N. If this decision is
negative, the noise suppression loop begins by filtering
the speech sample through the bandpass filter
corresponding to the present channel count. As noted
earlier, the bandpass filters are digitally implemented
using DSP techniques such that they function as 4-pole
Butterworth bandpass filters.

The speech sample output from bandpass filter (cc)
is then full-wave rectified in block 705, and low-pass
filtered in block 706, to obtain the energy envelope
value E(cc) for this particular sample. This channel
energy estimate is then stored by block 707 for later
use. As will be apparent to those skilled in the art,
energy envelope value E(cc) is actually an estimate of
the square root of the energy in the channel.

Block 708 obtains the raw gain value RG for
channel cc and performs gain smoothing by means of a
first order IIR filter, implementing the equation:

G(nT) = G((n-1)T) + K2(cc) (RG(nT) - G((n-1)T))

where G(nT) is the smoothed channel gain at time nT, T is
the sample period, G((n-1)T) is the smoothed channel gain
at time (n-1)T, RG(nT) is the computed raw channel gain
for the last frame period, and K2(cc) is the filter
coefficient for channel cc. This smoothing of the raw
gain values on a per-sample basis reduces the
discontinuities in gain changes, thereby significantly
improving noise flutter performance.
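
The per-sample gain smoothing equation may be sketched in
Python as shown below. The function names and the value used
for the coefficient K2(cc) are assumptions for illustration,
since the text does not state the coefficient values; the
frame length of 80 samples follows from the 10 millisecond
frame and 8 kHz sampling rate given earlier.

```python
def smooth_gain(prev_gain, raw_gain, k2):
    """One step of G(nT) = G((n-1)T) + K2 * (RG(nT) - G((n-1)T))."""
    return prev_gain + k2 * (raw_gain - prev_gain)


def smooth_frame(prev_gain, raw_gain, k2, samples_per_frame=80):
    """Return the smoothed gain at every sample of one frame.

    The raw gain is held constant over the frame, so the smoothed gain
    approaches it gradually instead of stepping at the frame boundary.
    """
    gains = []
    g = prev_gain
    for _ in range(samples_per_frame):
        g = smooth_gain(g, raw_gain, k2)
        gains.append(g)
    return gains


# Assumed coefficient 0.25; the raw gain steps from 0 to 1 at the frame start.
first_three = smooth_frame(0.0, 1.0, 0.25, samples_per_frame=3)
```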

Block 709 multiplies the filtered sample obtained
in block 704 by the smoothed gain value for channel cc
obtained from block 708. This operation modifies the
level of the bandpass filtered sample using the current
channel gain, corresponding to the operation of channel
gain modifier 250. Block 710 then adds the modified
filter sample for channel cc to the output sample total,
which, when performed N times, combines the N modified
bandpass filter outputs to form a single processed speech
sample output. The operation of block 710 corresponds to
channel combiner 260. Block 711 increments the channel
count by one and the procedure in blocks 703 through 711
is then repeated.

If the result of the test in 703 is true, the
output speech sample is de-emphasized at approximately
-6 dB per octave in block 712 according to the equation:

Y(nT) = X(nT) + K3 [Y((n-1)T)]

where X(nT) is the processed sample at time nT, T is the
sample period, Y(nT) and Y((n-1)T) are the de-emphasized
speech samples at times nT and (n-1)T respectively, and
K3 is the de-emphasis coefficient which has a value of
0.9375. The de-emphasized processed speech sample is
then output to the D/A converter block 613. Thus, the
noise suppression loop of Figure 7a illustrates both the
channel filter-bank noise suppression technique and the
per-sample channel gain smoothing technique.
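
The de-emphasis equation may likewise be sketched in Python.
The function name and the zero initial condition are
assumptions. The example also checks, for a short test
sequence, that de-emphasis with K3 = 0.9375 undoes a
pre-emphasis computed with the same coefficient when no gain
modification is applied in between.

```python
def de_emphasize(samples, k3=0.9375):
    """Apply Y(nT) = X(nT) + K3 * Y((n-1)T) to a sequence of samples."""
    out = []
    prev_out = 0.0  # assumed initial condition: previous output is zero
    for x in samples:
        prev_out = x + k3 * prev_out
        out.append(prev_out)
    return out


# Impulse response: each output is 0.9375 times the previous one.
impulse_response = de_emphasize([1.0, 0.0, 0.0])

# Round trip: pre-emphasize a short sequence inline, then de-emphasize it.
x = [1.0, 0.5, -0.25, 0.0]
pre = [x[0]] + [x[n] - 0.9375 * x[n - 1] for n in range(1, len(x))]
restored = de_emphasize(pre)
```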

The flowchart of Figure 7b more rigorously
describes the detailed operation of automatic gain
selector block 615 of Figure 6. Following processing of
all speech samples in a particular frame, the operation
is turned over to block 615 which serves to update the
individual channel gains. First of all, the channel
count (cc) is set to one in block 720. Next, decision
block 721 tests if all channels have been processed. If
not, operation proceeds with block 722 which calculates
the signal-to-noise ratio for the particular channel. As
previously mentioned, the SNR calculation is simply a
division of the per-channel energy estimates
(signal-plus-noise) by the per-channel background noise
estimates (noise). Therefore, block 722 simply divides the
current stored channel energy estimate from block 707 by
the current background noise estimate from block 628
according to the equation:

Index(cc) = [current frame energy for channel cc] /
[background noise estimate for channel cc].

The current valley level, 525 of Figure 5, is
then quantized in block 723 to produce a digital gain
table selection signal from an analog valley level.
Hysteresis is used in quantizing the valley level, since
the gain table selection signal should not be responsive
to minimal changes in current valley level.

In block 724, the particular gain table to be
indexed is chosen. In the present embodiment, the
quantized value of the current valley level generated in
block 723 is used to perform this selection. However,
any method of gain table selection may be used.

The SNR index calculated in block 722 is used in
block 725 to look up the raw channel gain value from the
appropriate gain table. Hence, the gain value is indexed
as a function of three variables: (1) the channel number;
(2) the current channel SNR estimate; and (3) the overall
average background noise level. The raw gain value is
then obtained in block 726 according to this
three-variable index.
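
A three-variable gain look-up of this kind may be sketched in
Python as follows. The table contents, the table dimensions,
the truncation of the SNR ratio to an integer index and the
clamping of that index are all assumptions made for the
example; the actual gain values are described only as
empirically determined.

```python
# Hypothetical gain tables, indexed as [table set][channel][SNR index].
GAIN_TABLES = [
    [[0.0, 0.25, 0.5, 1.0], [0.0, 0.5, 0.75, 1.0]],  # assumed low-noise set
    [[0.0, 0.0, 0.25, 1.0], [0.0, 0.25, 0.5, 1.0]],  # assumed high-noise set
]


def snr_index(frame_energy, noise_estimate, max_index):
    """Index(cc) = frame energy / background noise estimate, clamped to the table."""
    if noise_estimate <= 0.0:
        return max_index  # assumed convention: no noise means maximum SNR
    index = int(frame_energy / noise_estimate)
    return max(0, min(index, max_index))


def raw_gain(gain_tables, table_select, channel, frame_energy, noise_estimate):
    """Look up the raw channel gain from table set, channel number and SNR."""
    table = gain_tables[table_select][channel]
    return table[snr_index(frame_energy, noise_estimate, len(table) - 1)]


# Channel 1 of the low-noise set with an energy-to-noise ratio of 2.
example_gain = raw_gain(GAIN_TABLES, 0, 1, frame_energy=4.0, noise_estimate=2.0)
```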

Block 727 stores the raw gain value obtained in
block 726. Block 728 then increments the channel count,
and decision block 721 is re-entered. After all N
channel gains have been updated, operation proceeds to
block 621 to update the current valley level and the
current background noise estimate. Hence, automatic gain
selector block 615 updates the channel gain values on a
frame-by-frame basis as a function of the overall average
background noise level to more accurately generate noise
suppression gain factors for each particular channel.

Figure 7c and Figure 7d expand upon block 621 to
more specifically describe the function of automatic
background noise estimator 420 of Figure 5.
Particularly, Figure 7c describes the process of
simulating the post-processed energy and combining these
estimates, while Figure 7d describes the operation of
valley detector 570.

Referring now to Figure 7c, the operation for
simulating post-processed speech begins at block 730 by
setting the channel count (cc) to one. Block 731 tests
this channel count to see if all N channels have been
processed. If not, the equation of block 732 describes
the actual simulation process performed by energy
estimate modifier 560 of Figure 5.

Simulated post-processed speech energy is
generated by multiplying the raw channel gain values
(obtained directly from the channel gain tables) by the
pre-processed energy estimate (obtained from channel
energy estimator 220) for each channel via the equation:

SE(cc) = E(cc) RG(cc)

where SE(cc) is the simulated post-processed energy for
channel cc, E(cc) is the current frame energy estimate
for channel cc stored by block 707, and RG(cc) is the raw
channel gain value for channel cc obtained from block
725. As noted earlier, E(cc) is actually the square root
of the energy in the channel since it is a measure of the
signal envelope. Hence, the RG(cc) term of the above
equation is not squared. The multiplication performed in
block 732 serves essentially the same function as channel
gain modifier 250, except that the channel gain
modifier utilizes pre-processed speech signal whereas
energy estimate modifier 560 utilizes pre-processed
speech energy. (See Figure 5.)

The channel counter is then incremented in block
733, and retested in block 731. When a simulated
post-processed energy value is obtained for all N
channels, blocks 734 through 738 serve to combine the
individual simulated channel energy estimates to form the
single overall energy estimate according to the equation:

                         N
POST-PROCESSED ENERGY = SUM  CHANNEL(i) POST-PROCESSED ENERGY
                        i=1

where N is the number of filters in the filter-bank.

Block 734 initializes the channel count to one,
and block 735 initializes the overall post-processed
energy value to zero. After initialization, decision
block 736 tests whether or not all channel energies have
been combined. If not, block 737 adds the simulated
post-processed energy value for the current channel to
the overall post-processed energy value. The current
channel number is then incremented in block 738, and the
channel number is again tested at block 736. When all N
channels have been combined to form the overall simulated
post-processed energy estimate, operation proceeds to
block 740 of Figure 7d.
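
The simulation and combination steps of Figure 7c may be
sketched in Python as follows. The function names are
hypothetical and the numerical values are illustrative only.

```python
def simulated_post_processed_energy(channel_energy, raw_gains):
    """SE(cc) = E(cc) * RG(cc) for every channel (blocks 730 through 733)."""
    if len(channel_energy) != len(raw_gains):
        raise ValueError("one raw gain is required per channel")
    # E(cc) is an envelope value, so the gain is applied without squaring.
    return [e * g for e, g in zip(channel_energy, raw_gains)]


def overall_energy(simulated):
    """Sum the per-channel simulated energies (blocks 734 through 738)."""
    total = 0.0
    for se in simulated:
        total += se
    return total


# Three channels with illustrative envelope estimates and raw gains.
se = simulated_post_processed_energy([4.0, 2.0, 8.0], [1.0, 0.5, 0.25])
total = overall_energy(se)
```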

Referring now to Figure 7d, blocks 740 through 
745 illustrate how the post-processed signal energy is 
used to generate and update the previous valley level, 
corresponding to the operation of energy valley detector 
570 of Figure 5. After all the post-processed energies 
per channel have been combined, block 740 computes the 
logarithm of this combined post-processed channel energy. 
One reason that the log representation of the 
post-processed speech energy is used in the present 
embodiment is to facilitate implementation of an 
extremely large dynamic range (> 90dB) signal in an 8-bit 
microprocessor system. 

Decision block 741 then tests to see if this log
energy value exceeds the previous valley level. As
previously mentioned, the previous valley level is either
the stored valley level for the prior frame or an
initialized valley level provided by block 601 of Figure
6. If the log value exceeds the previous valley level,
the previous valley level is updated in block 743 with
the current log [post-processed energy] value by
increasing the level with the slow time constant of
approximately one second to form a current valley level.
This occurs when voice or a higher background noise level
is present. Conversely, if the output of decision block
741 is negative (log [post-processed energy] less than
previous valley level), the previous valley level is
updated in block 742 with the current log [post-processed
energy] value by decreasing the level with a fast time
constant of approximately 40 milliseconds to form the
current valley level. This occurs when a lower
background noise level is present. Accordingly, the
background noise history is continuously updated by
slowly increasing or rapidly decreasing the previous
valley level, depending upon the background noise level
of the current simulated post-processed speech energy
estimate.
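
The asymmetric valley level update may be sketched in Python
as shown below. The time constants of one second and 40
milliseconds and the 10 millisecond frame period are taken
from the text. The first-order form of the update and the
conversion from time constant to coefficient are assumptions,
since the text does not give the update equation.

```python
import math

FRAME_PERIOD_S = 0.010  # 10 millisecond frame, from the text


def coefficient(time_constant_s, frame_period_s=FRAME_PERIOD_S):
    """Assumed mapping from a time constant to a per-frame update coefficient."""
    return 1.0 - math.exp(-frame_period_s / time_constant_s)


RISE = coefficient(1.0)    # slow increase, about one second
FALL = coefficient(0.040)  # fast decrease, about 40 milliseconds


def update_valley(previous_valley, log_energy):
    """Move the valley level toward the log energy, slowly up and quickly down."""
    k = RISE if log_energy > previous_valley else FALL
    return previous_valley + k * (log_energy - previous_valley)


raised = update_valley(10.0, 20.0)   # energy above the valley: small rise
lowered = update_valley(20.0, 10.0)  # energy below the valley: large drop
```

With these assumed coefficients a single frame raises the
valley level by about one percent of the difference, but
lowers it by about 22 percent of the difference.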

After updating the previous valley level,
decision block 744 tests if the current log
[post-processed energy] value exceeds the current valley
level plus a predetermined offset. The addition of the
current valley level plus this valley offset produces a
noise threshold level. In the present embodiment, this
offset provides approximately a 6 dB increase to the
current valley level. Hence, another reason for
utilizing log arithmetic is to simplify the constant 6 dB
offset addition process.
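
The threshold test of decision block 744 may be sketched in
Python as follows. The function name and the expression of
the log values in decibels are assumptions; the 6 dB offset
is taken from the text.

```python
VALLEY_OFFSET_DB = 6.0  # approximate offset stated in the text


def is_noise_only(log_energy_db, current_valley_db, offset_db=VALLEY_OFFSET_DB):
    """Return True when the log energy does not exceed the noise threshold."""
    # In the log domain the threshold is formed by a single addition.
    noise_threshold_db = current_valley_db + offset_db
    return log_energy_db <= noise_threshold_db


# With a valley level of 30 dB the threshold is 36 dB.
quiet_frame = is_noise_only(35.0, 30.0)
loud_frame = is_noise_only(36.5, 30.0)
```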

If the log energy exceeds this threshold (which
would correspond to a frame of speech rather than
background noise), the current background noise estimate
is not updated, and the background noise updating process
terminates. If, however, the log energy does not exceed
the noise threshold level (which would correspond to a
detected minimum in the post-processed signal indicating
that only noise is present), the background noise
spectral estimate is updated in block 745. This
corresponds to the closing of channel switch 575 in
response to a positive valley detect signal from energy
valley detector 570. This updating process consists of
providing a time-averaged value of the pre-processed
channel energy estimate for the particular channel by
smoothing the estimate (in smoothing filter 580), and
storing these time-averaged values as per-channel noise
estimates (in energy estimate storage register 585). The
operation of background noise estimator block 621 ends
for the particular frame being processed by proceeding to
blocks 629 and 630 to obtain a new frame.

In summary, the present invention performs
spectral subtraction noise suppression by utilizing
post-processed speech signal to generate the background
noise estimate. The present invention further improves
the performance of these systems by utilizing overall
average background noise to generate the noise
suppression gain factors, and by smoothing these gain
factors on a per-sample basis. These novel techniques
allow the present invention to improve acoustic noise
suppression performance in high ambient noise backgrounds
without degrading the quality of the desired speech
signal.

While specific embodiments of the present
invention have been shown and described herein, further
modifications and improvements may be made by those
skilled in the art. All such modifications which retain
the basic underlying principles disclosed and claimed
herein are within the scope of this invention.

1. An improved noise suppression system for
attenuating the background noise from a noisy input
signal to produce a noise-suppressed output signal, said
noise suppression system comprising:

means for separating the input signal into a
plurality of pre-processed signals representative
of selected frequency channels;

means for modifying the gain of each of said
plurality of pre-processed signals in response to a
predetermined gain value to provide a plurality of
post-processed signals;

means for producing said predetermined gain
value in response to estimates of the
signal-to-noise ratio (SNR) in each individual
channel; and

means for generating said SNR estimates in
each individual channel based upon the current
signal energy estimate of the pre-processed signal
in each individual channel and the previous noise
energy estimate of the pre-processed signal in each
individual channel as determined by the detected
minima of a representation of said plurality of
post-processed signals.

2. The improved noise suppression system
according to claim 1, wherein said means for producing
said predetermined gain value includes:

a plurality of gain tables, each gain table
having predetermined individual channel gain values
corresponding to various individual channel SNR
estimates; and

gain table selection means for automatically
selecting one of said plurality of gain tables
according to the overall average background noise
level of said input signal.

3. An improved noise suppression system for
attenuating the background noise from a noisy
pre-processed input signal to produce a noise-suppressed
post-processed output signal by spectral gain
modification, said noise suppression system comprising:

signal dividing means for separating the
pre-processed input signal into a plurality of
selected frequency bands, thereby producing a
plurality of pre-processed channels;

channel energy estimation means for
generating an estimate of the energy in each of
said plurality of pre-processed channels;

background noise estimation means for
generating and storing estimates of the background
noise energy based upon said channel energy
estimates, and for periodically detecting the
minima of a representation of the post-processed
signal energy level such that said background noise
estimates are updated only during said minima;

channel SNR estimation means for generating
an estimate of the signal-to-noise ratio (SNR) of
each individual channel based upon said channel
energy estimates and said background noise
estimates;

channel gain controlling means for providing
channel gain values corresponding to said channel
SNR estimates;

channel gain modifying means for adjusting
the gain of each of said plurality of pre-processed
channels provided by said signal dividing means
according to said channel gain values, thereby
producing a plurality of post-processed channels;
and

channel combination means for recombining
said plurality of post-processed channels to
produce said post-processed output signal.

4. The improved noise suppression system
according to claim 3, wherein said background noise
estimation means includes means for generating said
representation of the post-processed signal energy level
by multiplying said plurality of pre-processed channels
by said channel gain values.

5# The improved noise suppression system 
according to claim 3, wherein said background noise 
estimation means includes: 

storage means for storing an estimate of the
background noise energy of the pre-processed signal
in each of said plurality of selected frequency
bands as per-channel noise estimates, and for
continuously providing said per-channel noise
estimates to said channel SNR estimation means;

valley detection means for periodically
detecting the minima of an overall estimate of the
energy of said representation of post-processed
signal in each of a plurality of selected frequency
bands, thereby generating a valley detect signal;
and

signal controlling means coupled to said
storage means and controlled by said valley detect
signal for providing new background noise estimates
to said storage means only during said minima.
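
The gating behaviour of the signal controlling means in claim 5 can be sketched as below. This is a simplified illustration: the function name is hypothetical, and the valley detect signal is assumed to be available as a boolean for the current frame.

```python
def update_noise_store(stored, candidate, valley_detect):
    """Return the per-channel noise estimates held after one frame.

    New estimates reach the store only while the valley detect signal is
    asserted; otherwise the previously stored estimates are kept unchanged.
    """
    if valley_detect:
        return list(candidate)
    return list(stored)
```

The stored estimates remain continuously readable by the SNR estimator whether or not an update occurs.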




6. The improved noise suppression system
according to claim 5, wherein said storage means
includes:

smoothing means for providing a time-averaged
value of each of said background noise energy
estimates of the pre-processed signal in a
particular frequency band; and

memory means for storing each of said
time-averaged values from said smoothing means as
per-channel noise estimates.
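
One possible form of the smoothing means in claim 6 is a first-order recursive average, sketched below. The claim requires only a time-averaged value; the recursive form, the function name and the coefficient `alpha` are assumptions made for illustration.

```python
def smooth_noise_estimates(previous, new, alpha=0.9):
    """Time-average per-channel noise energy estimates.

    Assumption: first-order recursive averaging, where `alpha` near 1.0
    weights the stored history heavily and gives a slowly varying estimate.
    """
    return [alpha * p + (1.0 - alpha) * n for p, n in zip(previous, new)]
```

The returned list would then be written to the memory means as the per-channel noise estimates.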

7. The improved noise suppression system
according to claim 5, wherein said valley detection means
includes:

means for storing the numerical value of the
previous detected minima as a previous valley
level;

means for comparing the present numerical
value of the overall energy estimate to said
previous valley level;

means for increasing said previous valley
level at a slow rate when said present numerical
value is greater than said previous valley level;
and

means for decreasing said previous valley
level at a rapid rate when said present numerical
value is less than said previous valley level,
thereby updating said previous valley level to
provide a current valley level.
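
The asymmetric valley tracking recited in claim 7 can be sketched as follows. The claim states only that the level rises slowly and falls rapidly; the proportional update rule, the function name and the two rate values are assumptions chosen for illustration.

```python
def update_valley(previous_valley, energy, up_rate=0.01, down_rate=0.5):
    """Track the minima of the overall energy estimate.

    Assumption: the valley level moves toward the present energy by a
    fraction of the difference, using a small fraction when rising and a
    large fraction when falling.
    """
    if energy > previous_valley:
        rate = up_rate      # slow increase
    else:
        rate = down_rate    # rapid decrease (no change when equal)
    return previous_valley + rate * (energy - previous_valley)
```

With these rates the valley level follows a drop in energy within a few frames, while a burst of speech energy raises it only slightly.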
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8. The improved noise suppression system
according to claim 3, wherein said channel gain 
controlling means includes: 

a plurality of gain tables, each gain table
having predetermined individual channel gain values
corresponding to various individual channel SNR
estimates;

gain table selection means for automatically
selecting one of said plurality of gain tables
according to the overall average background noise
level of said input signal;

whereby each individual channel gain value is
selected as a function of: (a) the individual
channel number, (b) the current channel SNR
estimate, and (c) the overall average background
noise level.
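
The three-way lookup recited in claim 8 can be sketched as below. The table layout, the threshold lists and the function name are hypothetical; the claim does not give table contents or the boundaries between noise levels or SNR ranges.

```python
import bisect

def select_gain(tables, noise_thresholds, snr_edges, channel, snr, avg_noise):
    """Pick a channel gain from (a) channel, (b) channel SNR, (c) noise level.

    Assumed layout: tables[noise_level][channel][snr_bin], with
    `noise_thresholds` and `snr_edges` sorted in ascending order.
    """
    # (c) the overall average background noise level selects the gain table
    table = tables[bisect.bisect_right(noise_thresholds, avg_noise)]
    # (b) the current channel SNR estimate selects the entry in that table
    snr_bin = bisect.bisect_right(snr_edges, snr)
    # (a) the channel number selects the row
    return table[channel][snr_bin]
```

A layout with one threshold and two SNR edges needs two tables, each holding three gain values per channel.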



9. The improved noise suppression system 
according to claim 8, wherein said channel gain 
controlling means further includes: 

gain smoothing means for smoothing the gain 
values provided by said channel gain controlling 
means to said channel gain modifying means.
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